A novel algorithm for enhancing search results by detecting dissimilar patterns based on correlation method

Authors

  • Poonkuzhali Sugumaran
  • Kishore Ravi
  • Thirumurugan Shanmugam
Abstract

The dynamic collection and voluminous growth of information on the web pose great challenges for retrieving relevant information. Although much research has addressed information retrieval and web mining, it has focused mainly on retrieving similar patterns and has neglected dissimilar patterns, which are likely to contain outlying data. This paper therefore concentrates on mining web content outliers, that is, extracting the dissimilar web documents from a group of documents in the same domain. Mining web content outliers indirectly helps promote business activities and improve the quality of search results. Existing algorithms for web content outlier mining were developed for structured documents, whereas the World Wide Web (WWW) contains mostly unstructured and semi-structured documents. There is therefore a need for a technique to mine outliers in unstructured and semi-structured document types. In this work, a novel statistical approach based on a correlation method is developed for retrieving relevant web documents through outlier detection. The method also identifies redundant web documents. Removing both redundant and outlying documents improves the quality of search results and better serves user needs. Evaluating the correlation method with Normalized Discounted Cumulative Gain (NDCG) gives scores above 90%. The experimental results show that this methodology performs better in terms of accuracy, recall, and specificity than existing methodologies.
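The abstract does not say how the correlation is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes whitespace tokenization, raw term-frequency vectors, and Pearson correlation between documents. The function names (`term_vector`, `pearson`, `classify_documents`, `ndcg`) and both thresholds are hypothetical choices made for illustration. The `ndcg` function uses the standard log2-discounted definition of NDCG.

```python
import math
from collections import Counter


def term_vector(text, vocabulary):
    """Term-frequency vector of `text` over a fixed, ordered vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def classify_documents(docs, outlier_threshold=0.0, redundant_threshold=0.95):
    """Return (outliers, redundant) as sorted lists of document indices.

    Assumed rules (not taken from the paper):
      - a document is redundant if it correlates at or above
        `redundant_threshold` with an earlier, non-redundant document;
      - a document is an outlier if its mean correlation with all other
        documents falls below `outlier_threshold`.
    """
    vocabulary = sorted({word for d in docs for word in d.lower().split()})
    vectors = [term_vector(d, vocabulary) for d in docs]
    n = len(docs)
    corr = [[pearson(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

    redundant = set()
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] >= redundant_threshold and i not in redundant:
                redundant.add(j)  # keep the earlier copy, flag the later one

    outliers = []
    for i in range(n):
        others = [corr[i][j] for j in range(n) if j != i]
        if others and sum(others) / len(others) < outlier_threshold:
            outliers.append(i)
    return outliers, sorted(redundant)


def ndcg(relevances, k=None):
    """NDCG of a ranked list of graded relevance scores (log2 discount)."""
    k = len(relevances) if k is None else k

    def dcg(scores):
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(scores[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Under these assumptions, a result list would be cleaned by dropping the indices returned by `classify_documents` and then scored by passing the graded relevance of the remaining ranked documents to `ndcg`. Both thresholds would need tuning on real data. Term-frequency vectors are only one possible representation.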

Similar resources

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...

FUZZY GRAVITATIONAL SEARCH ALGORITHM AN APPROACH FOR DATA MINING

The concept of intelligently controlling the search process of gravitational search algorithm (GSA) is introduced to develop a novel data mining technique. The proposed method is called fuzzy GSA miner (FGSA-miner). At first a fuzzy controller is designed for adaptively controlling the gravitational coefficient and the number of effective objects, as two important parameters which play major ro...

A Novel Fuzzy Based Method for Heart Rate Variability Prediction

Abstract In this paper, a novel technique based on fuzzy method is presented for chaotic nonlinear time series prediction. Fuzzy approach with the gradient learning algorithm and methods constitutes the main components of this method. This learning process in this method is similar to conventional gradient descent learning process, except that the input patterns and parameters are stored in mem...

A New Two-Stage Method for Damage Identification in Linear-Shaped Structures Via Grey System Theory and Optimization Algorithm

The main objective of this paper is concentrated on presenting a new two-stage method for damage localization and quantification in the linear-shaped structures. A linear-shaped structure is defined as a structure in which all elements are arranged only on a straight line. At the first stage, by employing Grey System Theory (GST) and diagonal members of the Generalized Flexibility Matrix (GFM),...

Objective Peak-Detection in Complex Auditory Brainstem Response to /ba/, /da/, /ga/: A Novel Technique

Objectives: The result of auditory brainstem response is used worldwide for detecting hearing impairments or hearing aids. This study aimed to introduce the superiority of mathematical innovation algorithm toward subjective evaluation by an audiologist. The automatic algorithm method is encouraged for detecting the waves of Auditory Brainstem Response (ABR), because it can reduce subjective eva...

Journal:
  • Int. Arab J. Inf. Technol.

Volume 14

Publication date: 2017